Poster: | stbalbach | Date: | Nov 30, 2009 11:19am |
Forum: | texts | Subject: | PDF's on Amazon Kindle |
This post was modified by stbalbach on 2009-11-30 19:19:13
Reply
Poster: | aibek | Date: | Dec 26, 2012 11:08pm |
Forum: | texts | Subject: | Re: PDF's on Amazon Kindle |
Does this problem still exist with Kindle? If yes, are you sure it is due to incompatible image types? If not, please see below.
I have found that some PDF files produced by IA do not follow the PDF specification, which causes them to fail in some programs (e.g., pdflatex, with the error ‘invalid font in reference type ’). I will write a post about this on the forum, with the solution, in due time; for now I am looking for other instances of the failure.
Could the Kindle problem be due to the same bug? Could you please check? The PDF file from the page linked below (Lyon’s Grammar) fails in pdflatex with the error mentioned above. If you are not sure about the source of the Kindle problem, could you please check (i) that it fails on the Kindle too, and (ii) that the solution I propose works?
(ii): To correct the file (this is a hack, for now), replace:
00006 0 obj
<< /F2 << /Type /Font /Subtype /Type1 /Encoding << /Differences 5 0 R >> /BaseFont /Times-Roman >>
/F2B << /Type /Font /Subtype /Type1 /Encoding << /Differences 5 0 R >> /BaseFont /Times-Bold >>
/F2I << /Type /Font /Subtype /Type1 /Encoding << /Differences 5 0 R >> /BaseFont /Times-Italic >>
/F3 << /Type /Font /Subtype /Type1 /Encoding << /Differences 5 0 R >> /BaseFont /Courier >>
>>
endobj
with:
00006 0 obj
<<>>
endobj
in the PDF file, using a hex editor or a good text editor. (Essentially, just delete everything between the two outermost pairs of angle brackets.) This edit corrupts the xref table, because the byte offsets of the objects that follow shift, but PDF readers should be able to digest the file anyway (perhaps with a warning saying that “xref table is corrupted”).
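For anyone without a hex editor, the edit above can be sketched in Python. This is only a minimal sketch under assumptions: the function name `blank_object` and the sample bytes are made up for illustration, and padding the emptied object with spaces (so that no byte offset moves and the xref table stays valid) is a variation on the hack, not what was done for the attached files.

```python
import re

def blank_object(pdf_bytes: bytes, obj_header: bytes = b"00006 0 obj") -> bytes:
    """Replace the body of one indirect object with an empty dictionary.

    The replacement is padded with spaces to the original length, so the
    byte offsets recorded in the xref table remain correct.
    """
    # Non-greedy match from the object header up to the first "endobj".
    pattern = re.compile(re.escape(obj_header) + rb"(.*?)endobj", re.DOTALL)
    match = pattern.search(pdf_bytes)
    if match is None:
        raise ValueError("object header not found")
    body = match.group(1)
    if len(body) < 6:
        raise ValueError("object body too short to hold an empty dictionary")
    # "\n<<>>" + padding + "\n" has exactly the same length as the old body.
    padded = b"\n<<>>" + b" " * (len(body) - 6) + b"\n"
    return pdf_bytes[:match.start(1)] + padded + pdf_bytes[match.end(1):]

# Hypothetical fragment for illustration only (not a complete PDF file).
sample = (
    b"%PDF-1.4\n"
    b"00006 0 obj\n"
    b"<< /F3 << /Type /Font /Subtype /Type1 /BaseFont /Courier >>\n"
    b">>\n"
    b"endobj\n"
    b"00007 0 obj\n<< /Length 0 >>\nendobj\n"
)
patched = blank_object(sample)
```

On a real file one would read it with `open(path, "rb")`, call the function, and write the result to a new file in binary mode, keeping the original untouched.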
I have attached two versions of a page of the book. (It is enough to check the error with one page.) p.30-orig.pdf is the original page. p.30-edited.pdf is the same page with the edit described above. There are no viruses in the files. You can check using just these two files (70 KB each).
Thanks for your help.
http://archive.org/details/analysissevenpa00unkngoog
Attachment: p.30-orig.pdf
Attachment: p.30-edited.pdf
Reply
Poster: | stbalbach | Date: | Dec 28, 2012 11:40pm |
Forum: | texts | Subject: | Re: PDF's on Amazon Kindle |
As I recall, it's because the Kindle DX doesn't support layered PDF documents.
I downloaded and opened both p.30 files and they both display correctly, perhaps because the original is not layered.
Try this one: http://archive.org/details/devisesetembleme00lafeu
On downloading the PDF and opening it in the DX, it says "some elements on this page can not be displayed" and every page is blank (i.e., 100% of the elements can't be displayed). If you want, make the changes mentioned above and I'll try it again (I don't have the tools to edit a binary file).
Stephen
Reply
Poster: | aibek | Date: | Dec 31, 2012 6:32am |
Forum: | texts | Subject: | Re: PDF's on Amazon Kindle |
Every page image that we see in this file is made up of 3 images: a background, a foreground and a mask. So the PDF really contains not 116, but 348 images. The foreground and background are in RGB; the mask is BW (1 bit per sample).
To see why this is useful, imagine brown text on a yellow page. If such a 1000x1000 sample image is saved in RGB (3 bytes per sample), it would take 3,000,000 bytes before compression. In the IA method the image is decomposed into 3 parts: a brown 1000x1000 image, a yellow 1000x1000 image, and a BW 1000x1000 mask, and each of them is compressed and saved separately in the PDF. It is the reader’s job to join them together. '0' in the mask image means the reader is to compose the final image by using the corresponding pixel of the foreground image; '1' means that the reader is to use the corresponding pixel of the background image.
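The joining step described above can be sketched in a few lines of Python. This is an illustration only, not how any particular reader implements it: the function name `compose`, the colour values and the tiny 2x3 "page" are assumptions, and all three layers are taken to have the same dimensions.

```python
def compose(foreground, background, mask):
    """Build the final image pixel by pixel.

    Mask value 0 selects the foreground pixel, 1 selects the background
    pixel, following the convention described in the post.
    """
    return [
        [fg if m == 0 else bg for fg, bg, m in zip(fg_row, bg_row, m_row)]
        for fg_row, bg_row, m_row in zip(foreground, background, mask)
    ]

# Illustrative RGB values; any brown and yellow would do.
BROWN = (139, 69, 19)
YELLOW = (255, 255, 0)

# Uniform colour layers plus a 1-bit mask marking where the "text" is.
foreground = [[BROWN] * 3 for _ in range(2)]
background = [[YELLOW] * 3 for _ in range(2)]
mask = [
    [0, 1, 1],
    [1, 0, 1],
]

page = compose(foreground, background, mask)
```

Here `page` holds brown pixels wherever the mask is 0 and yellow pixels everywhere else, which is the "brown text on a yellow page" picture in miniature.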
This helps because the three can be compressed much better. At an extreme, you can imagine the brown and the yellow images taking just a few bytes each: 3 bytes for recording the colour, and a few bytes more for recording the dimensions of the image. Also, our most critical data is in the mask -- that image has to be the sharpest. But it now has 1-bit samples, and not the 3 bytes (24 bits) per sample we had earlier.
There are a few more details. First, the background image is saved at a lower resolution, and the reader is asked to interpolate. So a small, if insignificant, loss in quality creeps into the IA "compression". Second, the background and foreground images are not limited to one colour -- they can be full-fledged RGB images too. The point is that, in the places covered by the (foreground) objects, the background image is filled with the surrounding colour -- those parts will not be used anyway -- so it has a more or less uniform colour or gradient. Similarly, the foreground image is filled with nearby colours in the places where it will not be read. This way the compression is much better. The critical steps are (i) identifying the foreground and background properly, and (ii) filling in colours in the areas that will not be read, so that the image compresses as well as possible with the compression method of choice ('JPXDecode' for the RGB images in IA files). IA’s PDF files are produced by 'LuraDocument PDF v2.28'.
The compression is pretty significant. In the attached page 2 of the Devises et Emblemes file, the three images together take 42 KB. The dimensions of the composed image, however, are 2201x3063, so RGB (3 bytes per sample) would take 19 MB! This is the size of the file you will get when you join the three together in the intended manner.
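The arithmetic behind that figure can be checked with a short Python sketch. The variable names are only illustrative, and it assumes that "KB" and "MB" are meant as 1024 and 1024x1024 bytes.

```python
# Dimensions of the composed page image, as given in the post.
width, height = 2201, 3063
bytes_per_sample = 3  # uncompressed RGB

raw_bytes = width * height * bytes_per_sample
raw_mib = raw_bytes / 2**20

# Size of the three stored images together (assumption: 1 KB = 1024 bytes).
stored_bytes = 42 * 1024
ratio = raw_bytes / stored_bytes

print(f"uncompressed: {raw_bytes} bytes ({raw_mib:.1f} MiB)")
print(f"stored: {stored_bytes} bytes, ratio about {ratio:.0f}:1")
```

Under these assumptions the uncompressed page comes to 20,224,989 bytes, a little over 19 MiB, which is several hundred times the size of the three stored images.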
So, most likely, the Kindle is refusing to do this work. Please check for the text-layer issue too. The attached orig-p.2.pdf is p. 2 of the book. The minus-textlayer.pdf is that page minus the text layer. (Both files are 40 KB in size.) I am assuming that your Kindle can read neither of the files.
Reply
Poster: | aibek | Date: | Dec 31, 2012 6:38am |
Forum: | texts | Subject: | Re: PDF's on Amazon Kindle |
Attachment: orig-p.2.pdf
Attachment: minus-textlayer.pdf
Reply
Poster: | stbalbach | Date: | Jan 1, 2013 2:19pm |
Forum: | texts | Subject: | Re: PDF's on Amazon Kindle |
Reply
Poster: | aibek | Date: | Jan 5, 2013 7:52pm |
Forum: | texts | Subject: | Re: PDF's on Amazon Kindle |
Reply
Poster: | aibek | Date: | Dec 29, 2012 9:44pm |
Forum: | texts | Subject: | Re: PDF's on Amazon Kindle |
Let me investigate the properties of the Devises et Emblemes file.
Reply
Poster: | aibek | Date: | Dec 26, 2012 11:57pm |
Forum: | texts | Subject: | Re: PDF's on Amazon Kindle |
Attachment: p.30-edited-xref-notfixed.pdf